Gene coexpression measures in large heterogeneous samples using count statistics.
Abstract
With the advent of high-throughput technologies making large-scale gene expression data readily available, developing appropriate computational tools to process these data and distill insights into systems biology has been an important part of the "big data" challenge. Gene coexpression is one of the earliest techniques developed that is still widely in use for functional annotation, pathway analysis, and, most importantly, the reconstruction of gene regulatory networks, based on gene expression data. However, most coexpression measures do not specifically account for local features in expression profiles. For example, it is very likely that the patterns of gene association may change or only exist in a subset of the samples, especially when the samples are pooled from a range of experiments. We propose two new gene coexpression statistics based on counting local patterns of gene expression ranks to take into account the potentially diverse nature of gene interactions. In particular, one of our statistics is designed for time-course data with local dependence structures, such as time series coupled over a subregion of the time domain. We provide asymptotic analysis of their distributions and power, and evaluate their performance against a wide range of existing coexpression measures on simulated and real data. Our new statistics are fast to compute, robust against outliers, and show comparable and often better general performance.
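The idea of counting local patterns of expression ranks can be illustrated with a minimal sketch. This is not the authors' definition of either statistic; it only assumes that a "local pattern" is the rank ordering of a short contiguous window, and that two genes agree on a window when their rank orderings match (or match after negating one profile, to capture negative association). The function names `rank_pattern` and `local_pattern_count` and the window length `k` are hypothetical choices made for this example.

```python
import numpy as np


def rank_pattern(window):
    """Return the rank ordering of a 1-D window as a tuple of ints.

    Ties are broken by position (stable sort); this is an assumption
    of the sketch, not something stated in the abstract.
    """
    order = np.argsort(window, kind="stable")
    ranks = np.argsort(order, kind="stable")
    return tuple(int(r) for r in ranks)


def local_pattern_count(x, y, k=3):
    """Count contiguous length-k windows where x and y share a rank pattern.

    A window also counts when x matches the pattern of -y, so that
    locally anti-correlated profiles contribute as well.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1-D arrays of equal length")
    if not 2 <= k <= x.size:
        raise ValueError("k must satisfy 2 <= k <= len(x)")

    matches = 0
    for start in range(x.size - k + 1):
        wx = x[start:start + k]
        wy = y[start:start + k]
        px = rank_pattern(wx)
        if px == rank_pattern(wy) or px == rank_pattern(-wy):
            matches += 1
    return matches


if __name__ == "__main__":
    gene_a = [1, 3, 2, 5, 4, 6]
    gene_b = [10, 30, 20, 50, 40, 60]   # same local orderings as gene_a
    print(local_pattern_count(gene_a, gene_b, k=3))
```

Because only within-window ranks are compared, a single extreme value affects at most `k` windows, which is one intuition for why count statistics of this kind can be robust to outliers. A large count concentrated in a subregion of the samples would suggest an association that exists only locally.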
Similar resources
Subspace Differential Coexpression Analysis: Problem Definition and a General Approach
In this paper, we study methods to identify differential coexpression patterns in case-control gene expression data. A differential coexpression pattern consists of a set of genes that have substantially different levels of coherence of their expression profiles across the two sample-classes, i.e., highly coherent in one class, but not in the other. Biologically, a differential coexpression pat...
Rank of Correlation Coefficient as a Comparable Measure for Biological Significance of Gene Coexpression
Information regarding gene coexpression is useful to predict gene function. Several databases have been constructed for gene coexpression in model organisms based on a large amount of publicly available gene expression data measured by GeneChip platforms. In these databases, Pearson's correlation coefficients (PCCs) of gene expression patterns are widely used as a measure of gene coexpression. ...
Cosplicing network analysis of mammalian brain RNA-Seq data utilizing WGCNA and Mantel correlations
Across species and tissues and especially in the mammalian brain, production of gene isoforms is widespread. While gene expression coordination has been previously described as a scale-free coexpression network, the properties of transcriptome-wide isoform production coordination have been less studied. Here we evaluate the system-level properties of cosplicing in mouse, macaque, and human brai...
Latent Domain Word Alignment for Heterogeneous Corpora
This work focuses on the insensitivity of existing word alignment models to domain differences, which often yields suboptimal results on large heterogeneous data. A novel latent domain word alignment model is proposed, which induces domain-conditioned lexical and alignment statistics. We propose to train the model on a heterogeneous corpus under partial supervision, using a small number of seed...
Study of association between rs7975232 polymorphism in vitamin D receptor gene and periodontitis by Tetra Arms PCR
Background and Aims: Periodontitis is an inflammatory multifactorial disease of the oral tissues for which many genetic and environmental factors are responsible. Vitamin D deficiency has been determined to be related to periodontal disease. The aim of this study was to investigate the association between rs7975232 polymorphism in vitamin D Receptor gene and periodontitis in 100 patients (as patient...
Journal title:
- Proceedings of the National Academy of Sciences of the United States of America
Volume 111, Issue 46
Pages -
Publication date 2014